Learning OOV through semantic relatedness in spoken dialog systems

Authors

  • Ming Sun
  • Yun-Nung Chen
  • Alexander I. Rudnicky
Abstract

Ensuring language coverage in dialog systems can be a challenge, since the language in a domain may drift over time, creating a mismatch between the original training data and current input. This in turn degrades performance by increasing misunderstanding and eventually leading to task failure. Without the capability of adapting the vocabulary and the language model based on certain domains or users, recognition errors may degrade the understanding performance, and even lead to a task failure, which incurs more time and effort to recover. This paper investigates how coverage can be maintained by automatically acquiring potential out-of-vocabulary (OOV) words by leveraging different types of relatedness between vocabulary items and words retrieved from web-based resources. Our experiments show that both recognition and semantic parsing accuracy can thereby be improved.

Similar resources

Combining Semantic Word Classes and Sub-Word Unit Speech Recognition for Robust OOV Detection

Out-of-vocabulary words (OOVs) are often the main reason for the failure of tasks like automated voice searches or human-machine dialogs. This is especially true if rare but task-relevant content words, e.g. person or location names, are not in the recognizer’s vocabulary. Since applications like spoken dialog systems use the result of the speech recognizer to extract a semantic representation o...

Recognition of Out-of-vocabulary Words and Their Semantic Category

In almost all applications of automatic speech recognition, especially in spontaneous speech tasks, the recognizer vocabulary cannot cover all occurring words. There is always a significant number of out-of-vocabulary (OOV) words even when the vocabulary size is very large. In this paper we present a new approach for the integration of OOV words into statistical language models. It is based on t...

Semantic processing of out-of-vocabulary words in a spoken dialogue system

One of the most important causes of failure in spoken dialogue systems is usually neglected: the problem of words that are not covered by the system’s vocabulary (out-of-vocabulary or OOV words). In this paper a methodology is described for the detection, classification and processing of OOV words in an automatic train timetable information system [2]. The various extensions that had to be effe...

Presentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures

Automatic short answer grading (ASAG) is the automated process of assessing answers based on natural language using computation methods and machine learning algorithms. Development of large-scale smart education systems on one hand and the importance of assessment as a key factor in the learning process and its confronted challenges, on the other hand, have significantly increased the need for ...

Publication date: 2015